Characteristics of scientific Web publications: Preliminary data gathering and analysis

Authors

  • Erik Thorlund Jepsen
  • Piet Seiden
  • Peter Ingwersen
  • Lennart Björneborn
  • Pia Borlund
Abstract

Because of the increasing presence of scientific publications on the Web, combined with the existing difficulties in easily verifying and retrieving these publications, research on techniques and methods for retrieval of scientific Web publications is called for. In this article, we report on the initial steps taken toward the construction of a test collection of scientific Web publications within the subject domain of plant biology. The steps reported are those of data gathering and data analysis aiming at identifying characteristics of scientific Web publications. The data used in this article were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both AltaVista and AllTheWeb retrieved a higher degree of accessible scientific content than Google. Because of the search engine cutoffs of accessible URLs, the feasibility of using search engine output for Web content analysis is also discussed.

Introduction

The Web has a significant impact on the practice in scientific publication. According to Cronin and McKim (1996, p. 170), the Web is reshaping the ways in which scholars communicate with one another, i.e., new kinds of scholarly and proto-scholarly publishing are emerging, which means that work-in-progress, broadsides, early drafts, and refereed articles are now almost immediately sharable. Nevertheless, only a minor part of these scientific publications is accessible through the Web, because they are difficult to control (i.e., to find, identify, access, and assess). In February 1999 Lawrence and Giles (1999, p. 107) estimated that only 6% of randomly selected Web sites contained scientific or educational content, defined as university, college, and research lab servers. It is an open question whether content of a scientific nature should solely be found in those domains. In response to this question, Björneborn and Ingwersen (2001, p. 69) explain …


Similar resources

Iran’s Scientific Publications in the Field of Endocrinology and Metabolism in the Web of Science: A Scientometric Analysis

Introduction: Scientific publications are among the most important indicators for evaluating the development of scientific fields in different countries. The present study aimed to evaluate Iranian scientific publications in the field of endocrinology and metabolism. Materials and Methods: Documents for analysis included all Iranian scientific publications in the field of endocrinology and meta...


The extent of scientific collaboration of type-1 universities of medical sciences at the national and international levels, based on documents indexed in the ISI database between 2004 and 2008

Purpose: To investigate the scientific collaborations (for the papers indexed in ISI web of knowledge) among researchers from type-1 universities of medical sciences during 2004-2008, the present study was conducted. Methodology: Webometrics based upon the co-authorship index was used. The population under study included the Tehran, Shahid Beheshti, Iran, Shiraz, Isfahan, Mashhad, Tabriz, Jond...


Kharazmi University Scientific Publications and Co-authorship Networks in Web of Science (1994-2020)

Background: The performance and collaboration of universities can be measured through scientific publications and scientometrics indicators. The purpose of this article is to describe the scientific publications situation of Kharazmi University and to discover the important actors of the Co-authorship networks of this university at three levels of researchers, organizations and countries in the...


The Status of Medical Education Studies by Iranian Researchers Among the Educational Publications Indexed in Web of Science

  Introduction: Medical education is a broad field of study that, as a subset of educational research, examines inputs, processes and outputs associated with teaching and learning medical sciences. The purpose of this study was to investigate the status of medical education studies by Iranian researchers among the educational publications indexed in Web of Science from 1990 to 2015. Methods: Th...


Scientific Outputs on Cyberchondria: Scientometric, Altmetric and Researchers’ Scientific Collaboration Network Analysis

Introduction: Cyberchondria can be considered one of the emerging challenges in the age of the Internet. The aim of this study was to investigate the scientific productivity and analyze the collaboration network of authors in the cyberchondria field in the Web of Science (WoS). Methods: To conduct this research, scientometric, altmetrics and social network analysis indicators were utilized. A...



Journal title:
  • JASIST

Volume 55  Issue 

Pages  -

Publication date 2004